3 research outputs found

    On Identifying Critical Nuggets Of Information During Classification Task

    In large databases, there may exist critical nuggets - small collections of records or instances that contain domain-specific important information. This information can be used for future decision making, such as labeling critical, unlabeled data records and improving classification results by reducing false positive and false negative errors. In recent years, data mining efforts have focused on pattern and outlier detection methods. However, not much effort has been dedicated to finding critical nuggets within a data set. This work introduces the idea of critical nuggets, proposes an innovative domain-independent method to measure criticality, suggests a heuristic to reduce the search space for finding critical nuggets, and isolates and validates critical nuggets from some real world data sets. It seems that only a few subsets may qualify as critical nuggets, underscoring the importance of finding them, and the proposed methodology can detect them. This work also identifies certain properties of critical nuggets and provides experimental validation of those properties. Critical nuggets were then applied to two important performance metrics for classification tasks: classification accuracy and misclassification costs. Experimental results helped validate that critical nuggets can assist in improving classification accuracy on real world data sets when compared with standalone classification algorithms. The improvements in accuracy using the critical nuggets were statistically significant. Extensive studies were also undertaken on real world data sets that utilized critical nuggets to help minimize misclassification costs. In this case as well, the critical nuggets based approach yielded statistically significant, lower misclassification costs than standalone classification methods.

    Modeling Hydroclimatic Change in Southwest Louisiana Rivers

    We applied the newly developed WRF-Hydro model to investigate hydroclimatic trends across the three basins in Southwest Louisiana as well as their connection with large-scale atmospheric drivers. Using the North American Land Data Assimilation System Phase 2 (NLDAS-2), we performed a multi-decadal model hindcast covering the period 1979–2014. After validating the model's performance against available observations, trend and wavelet analyses were applied to the time series of hydroclimatic variables from NLDAS-2 (temperature and precipitation) and from the model results (evapotranspiration, soil moisture, water surplus, and streamflow). Trend analysis of the model-simulated monthly and annual time series indicates that the regional climate has been warming and drying over the past decades, specifically during spring and summer (the growing season). Wavelet analysis reveals that, since the late 1990s, the anomalies of evapotranspiration, soil moisture, and streamflow have exhibited high coherency with that of precipitation. Pettitt's test detects a possible change-point around the year 2004, after which monthly precipitation decreased from 140 to 120 mm, evapotranspiration increased slightly from 80 to 83 mm, and water surplus decreased from 60 to 38 mm. Changes in regional climate conditions are closely correlated with large-scale climate dynamics such as the Atlantic Multidecadal Oscillation (AMO) and the El Niño Southern Oscillation (ENSO).
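For readers unfamiliar with the change-point step mentioned in the abstract, the following is a minimal sketch of Pettitt's test in its standard textbook form. It is not the authors' code: the function name `pettitt_test`, its return convention, and the approximate p-value formula used here are assumptions made for illustration, and the abstract does not say how the test was configured.

```python
import numpy as np


def pettitt_test(x):
    """Sketch of Pettitt's non-parametric change-point test (hypothetical helper).

    Returns (t, k, p): t is the number of samples in the first segment at the
    most likely change point, k is the test statistic max|U_t|, and p is the
    usual approximate two-sided p-value.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    # s[i, j] = sign(x[i] - x[j]) for every pair of samples
    s = np.sign(x[:, None] - x[None, :])
    # U_t sums the signs over all pairs with i in the first t samples
    # and j in the remaining n - t samples, for t = 1 .. n-1
    u = np.array([s[:t, t:].sum() for t in range(1, n)])
    idx = int(np.argmax(np.abs(u)))
    k = float(np.abs(u[idx]))
    # Approximate significance: p ~ 2 * exp(-6 K^2 / (n^3 + n^2)), capped at 1
    p = min(1.0, 2.0 * float(np.exp(-6.0 * k**2 / (n**3 + n**2))))
    return idx + 1, k, p
```

Applied to a monthly precipitation series, a small `p` would suggest a shift in the median after the first `t` samples. The pairwise sign matrix costs O(n²) memory, which is acceptable for a few hundred monthly values but not for very long series.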